[WIP] Moving Cyphal/UDP to multicast #253
Question: Is anonymous listening possible or not? There seems to be some contradiction:
vs
|
Question: Does it make sense to move |
There is no contradiction: "The concept of anonymous transfer is not defined for Cyphal/UDP" means that you can't send anonymous transfers, but you can still create anonymous nodes that cannot communicate (they can only listen).
Probably not (yet) because you would have to update other transports dependent on this class. |
Ok, thanks. "Datagram" still needs some work, will ping you when a review is needed. |
6103be2 to 93a7b26
Does the passing of the |
The changes look good to me. Later on, we may want to enable anonymous transfers for UDP as I wrote on the forum; it should be a cheap change to introduce. The relevant part of the source code can be found by searching for "In Cyphal/UDP, the anonymous mode is somewhat bolted-on." The text and diagrams in the module docstring at Nice work so far ;) |
Added to todo-list 👍 |
Could you review these changes? I just want to make sure that this part is correct ( Also some questions:
|
Nice progress, but there seems to be a problem with a subject-/service-/node-ID mixup. Please take another look at section 4.1.1 "Transport model" of the Specification.
To publish a message on subject S, we send a multicast datagram to the multicast group whose address is computed as:
And the destination UDP port is set to 16383.
To send a request or response on service X to node N, we send a multicast datagram to the multicast group whose address is computed as:
And the destination UDP port is set to (16384 + X * 2 + (is_response)). |
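For reference, the port arithmetic quoted above can be sketched as follows. The constant and function names here are hypothetical (they are not taken from the PR); only the numbers 16383, 16384 and the formula come from the comment above.

```python
from typing import Tuple

SUBJECT_PORT = 16383       # destination port for all message (subject) traffic
SERVICE_BASE_PORT = 16384  # first port of the service range


def service_port(service_id: int, is_response: bool) -> int:
    """Destination UDP port for a request or response on the given service."""
    if service_id < 0:
        raise ValueError("service-ID must be non-negative")
    # Each service takes two consecutive ports: even offset = request, odd = response.
    return SERVICE_BASE_PORT + service_id * 2 + int(is_response)


def parse_service_port(port: int) -> Tuple[int, bool]:
    """Inverse of service_port(): recover (service_id, is_response) from a port."""
    if port < SERVICE_BASE_PORT:
        raise ValueError("not a service port")
    offset = port - SERVICE_BASE_PORT
    return offset // 2, bool(offset % 2)
```

Since requests and responses of one service occupy adjacent ports, the receiving side can recover both the service-ID and the role from the destination port alone.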
Could you please update your branch to sync up with master? |
6044614 to 829c071
Could you confirm the changes are correct? I think I have addressed the issues. |
829c071 to 1b0c1c6
Regarding
class IPv4SocketFactory(SocketFactory):
    def __init__(self, domain_id: int): ...

    def make_output_socket(
        self, remote_node_id: typing.Optional[int], data_specifier: pycyphal.transport.DataSpecifier
    ) -> socket.socket:
        # General setup
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
        s.setblocking(False)
        s.bind((str(self._local), 0))  # QUESTION: What local IP address to bind to? (Does it need to be bound at all?)
        # Message
        remote_ip = message_data_specifier_to_multicast_group(self._domain_id, data_specifier)
        remote_port = SUBJECT_PORT
        # Service
        remote_ip = service_data_specifier_to_multicast_group(self._domain_id, remote_node_id, data_specifier)
        remote_port = service_data_specifier_to_udp_port(data_specifier)
        # Connect
        s.connect((str(remote_ip), remote_port))

    def make_input_socket(
        self, remote_node_id: typing.Optional[int], data_specifier: pycyphal.transport.DataSpecifier  # CHANGE: need remote_node_id for service
    ) -> socket.socket:
        # General setup
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
        s.setblocking(False)
        # Message
        multicast_ip = message_data_specifier_to_multicast_group(self._domain_id, data_specifier)
        multicast_port = SUBJECT_PORT
        # Service
        multicast_ip = service_data_specifier_to_multicast_group(self._domain_id, remote_node_id, data_specifier)
        multicast_port = service_data_specifier_to_udp_port(data_specifier)
        # Bind
        s.bind((str(multicast_ip), multicast_port))
Look right? |
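One thing the outline above does not show is the group subscription itself: binding to a multicast address does not by itself join the group. A minimal sketch using only the standard `socket` module is below. The function names and the `iface` parameter are hypothetical and not part of the PR; this is not necessarily how the final code does it.

```python
import socket


def membership_request(group: str, iface: str) -> bytes:
    """Pack an ip_mreq structure: multicast group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def make_multicast_input_socket(group: str, port: int, iface: str) -> socket.socket:
    """Create a non-blocking UDP socket subscribed to the given multicast group."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    s.setblocking(False)
    # Allow several sockets on this host to bind to the same endpoint.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # On Linux, binding to the group address filters out datagrams sent to other
    # destinations on the same port. Other platforms may behave differently
    # (Windows generally needs the wildcard address here).
    s.bind((group, port))
    # Binding alone does not subscribe the host to the group; this option does.
    s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request(group, iface))
    return s
```

For example, `make_multicast_input_socket("239.52.2.100", 16383, "127.0.0.1")` would subscribe via the loopback interface, provided that interface supports multicast on the host in question.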
658c6c3 to 176275f
I think it's almost finished. The last change that needs to be addressed is mainly related to
This seems to suggest that some significant overhaul/simplification can be done here. Can you give some pointers? Some smaller questions:
if ip_destination.is_multicast:
    if udp_packet.destination_port == SUBJECT_PORT:
        # Message packet
        dst_nid = None  # Broadcast
        data_spec = multicast_group_to_message_data_specifier(ip_source, ip_destination)
    else:
        # Service packet
        data_spec = udp_port_to_service_data_specifier(udp_packet.destination_port)
        # QUESTION: Correct to use DOMAIN_ID_MASK here? (or make a separate function for this?)
        domain_id = (int(ip_destination) & DOMAIN_ID_MASK) >> 18
        dst_nid = service_multicast_group_to_node_id(domain_id, ip_destination)

msg_i = fac.make_input_socket(None, MessageDataSpecifier(612))
test_msg_o.sendto(b"Seagull", ("239.52.2.100", SUBJECT_PORT))
time.sleep(1)  ## QUESTION: BlockingIOError: [Errno 35] Resource temporarily unavailable
rx = msg_i.recvfrom(1024)
assert rx[0] == b"Seagull"
assert rx[1][0] == "127.0.0.1"  # Same address we just bound to.
This extra |
Yes, indeed, as you have correctly guessed, it should be possible to get rid of the SocketReader or at least simplify it. There may be an arbitrary number of sockets connected to the same multicast endpoint and the OS should perform demultiplexing correctly for you. Maybe for now, in the interest of minimizing the scope of this changeset, we should keep the socket reader in place and consider removing it later.
I am not sure what you are going to use the domain-ID for here. It doesn't seem to be needed to parse the frame, unless I am missing something.
The kernel works in mysterious ways, what else can I say? Since this is just a test, it is fine to simply keep the sleep in place. |
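As background on the `BlockingIOError`: it is what `recvfrom()` raises on a non-blocking socket when no datagram has been queued yet, which is why the sleep helps. If the fixed sleep ever becomes a nuisance, one possible alternative (a sketch only, not what the PR does) is to wait for readability with `select`. The helper name below is hypothetical.

```python
import select
import socket


def recv_with_timeout(sock: socket.socket, size: int, timeout: float):
    """Wait until the non-blocking socket is readable, then read one datagram."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        raise TimeoutError("no datagram arrived within the timeout")
    # At least one datagram is queued now, so this call does not raise BlockingIOError.
    return sock.recvfrom(size)
```

In the test above, this would replace the `time.sleep(1)` and `recvfrom` pair with `rx = recv_with_timeout(msg_i, 1024, 1.0)`, returning as soon as the datagram is delivered.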
There is some discussion on the forum that you should be aware of: https://forum.opencyphal.org/t/cyphal-udp-architectural-issues-caused-by-the-dependency-between-the-nodes-ip-address-and-its-identity/1765/41 There's nothing major but it seems like we'll have to shuffle some bits around the header and the IP address, start using one common UDP port number for all traffic and discriminate services based on a dedicated service-ID field in the header instead of UDP ports, and also possibly add a header checksum. All of these changes seem quite minor in comparison to what you've already implemented here. |
First I re-wrote
def _dispatch_frame(
    self, timestamp: Timestamp, source_ip_address: _IPAddress, frame: typing.Optional[UDPFrame]
) -> None:
    # Do not accept datagrams emitted by the local node itself. Do not update the statistics either.
    external = self._anonymous or (source_ip_address != self._local_ip_address)
    if not external:
        return
    # Process the datagram. This is where the actual demultiplexing takes place.
    # The node-ID mapper will return None for datagrams coming from outside of our Cyphal subnet.
    handled = False
    source_node_id = None
    if frame is not None:
        # if source_ip_address is part of our Cyphal subnet
        if (DOMAIN_ID_MASK & int(source_ip_address)) == (DOMAIN_ID_MASK & int(self._local_ip_address)):
            source_node_id = frame.source_node_id
        # if source_ip_address is not part of our Cyphal subnet, source_node_id is None
        else:
            source_node_id = None
Now I'm starting to suspect that this is not how it's meant to be. Instead it should be:
def __init__(
    self,
    sock: socket.socket,
    local_ip_address: _IPAddress,
    domain_id: int,
    anonymous: bool,
    statistics: SocketReaderStatistics,
):
    self._domain_id = domain_id

def _dispatch_frame(
    self, timestamp: Timestamp, source_ip_address: _IPAddress, frame: typing.Optional[UDPFrame]
) -> None:
    # Do not accept datagrams emitted by the local node itself. Do not update the statistics either.
    external = self._anonymous or (source_ip_address != self._local_ip_address)
    if not external:
        return
    # Process the datagram. This is where the actual demultiplexing takes place.
    # The node-ID mapper will return None for datagrams coming from outside of our Cyphal subnet.
    handled = False
    source_node_id = None
    if frame is not None:
        # if source_ip_address is part of our Cyphal subnet
        if self._domain_id == (DOMAIN_ID_MASK & int(source_ip_address)):
            source_node_id = frame.source_node_id
        # if source_ip_address is not part of our Cyphal subnet, source_node_id is None
        else:
            source_node_id = None
I'm not sure if
Note to self: replace subnet with domain_id, to avoid further confusion. |
You are correct in suspecting this!
In this new design, unicast IP addresses are no longer relevant at all. Any node can operate on any domain-ID with any node-ID regardless of its identity on the IP layer. Parameters like the |
Can you check this
Main changes:
# Old
accepted_datagrams: typing.Dict[int, int] = dataclasses.field(default_factory=dict)
dropped_datagrams: typing.Dict[typing.Union[_IPAddress, int], int] = dataclasses.field(default_factory=dict)
# New
accepted_datagrams: typing.Dict[typing.Optional[int], int] = dataclasses.field(default_factory=dict)
dropped_datagrams: typing.Dict[typing.Optional[int], int] = dataclasses.field(default_factory=dict)
# Keys:
#   None: invalid node-ID (for dropped_datagrams)
#   None: anonymous frame (for accepted_datagrams)
#   int: node-ID
Concerning the keys: use (Ignore the Unit tests |
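To make the proposed key convention concrete, here is a small self-contained sketch of how such counters could be updated. The class name and the two helper methods are hypothetical. Only the two field declarations and the meaning of the keys come from the comment above.

```python
import dataclasses
import typing


@dataclasses.dataclass
class ReaderStatisticsSketch:
    # Key: source node-ID, or None for an anonymous frame.
    accepted_datagrams: typing.Dict[typing.Optional[int], int] = dataclasses.field(default_factory=dict)
    # Key: source node-ID, or None when the node-ID is invalid.
    dropped_datagrams: typing.Dict[typing.Optional[int], int] = dataclasses.field(default_factory=dict)

    def count_accepted(self, source_node_id: typing.Optional[int]) -> None:
        """Increment the accepted counter for this source (None = anonymous)."""
        self.accepted_datagrams[source_node_id] = self.accepted_datagrams.get(source_node_id, 0) + 1

    def count_dropped(self, source_node_id: typing.Optional[int]) -> None:
        """Increment the dropped counter for this source (None = invalid node-ID)."""
        self.dropped_datagrams[source_node_id] = self.dropped_datagrams.get(source_node_id, 0) + 1
```

Note that `None` means different things in the two dictionaries, so a reader of the statistics has to know which counter a key came from.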
Co-authored-by: Pavel Kirienko <pavel.kirienko@gmail.com>
This is easy to fix: make UDPTransport accept Aside from that (and a few related type errors), the only remaining issue is that of the statistics. #279 |
Please also bump the minor version number and add a new section to the changelog. |
This MR is based on this proposal discussed in the OpenCyphal forum.
In short, the changes can be broken down into three pieces:
1. Datagram header format
Current:
Proposal:
Note: version will be bumped to 1; however, no backward-compatibility changes are made (since the protocol is still in development).
2. Message
Current:
Proposal:
3. Service
Current: regular unicast
Proposal:
TODO